Scheduled Halo Exchange #980
|
cscs-ci run default |
|
cscs-ci run extra |
|
cscs-ci run dace |
**NOTE:** This commit still follows the old nomenclature, where `None` means default stream. Most likely this will change such that `None` means "not using the `schedule_*()` functions", and another singleton will be used for the default stream.
- There are now two protocols that describe how to extract the underlying address. They are probably at the wrong location.
- `stream=None` no longer means "default stream" but is now equivalent to "do not use the scheduled version".
- To indicate the default stream, the singleton `DefaultStream` is used. The `cupy.cuda.Stream.null` singleton was not used because it would require `cupy` to be present.
- However, using the default stream is still the default behaviour.
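To illustrate the points above, here is a minimal sketch of how a `cupy`-free default-stream sentinel and two address-extraction protocols could fit together. None of this is taken from the PR's code: `HasPtr`, `HasCudaStream`, `stream_address` and `FakeStream` are hypothetical names, and the `__cuda_stream__()` signature is an assumption. The only facts relied on are that `cupy` streams expose their address as `.ptr` and that the CUDA legacy default stream has address 0.

```python
from __future__ import annotations

import enum
from typing import Protocol, runtime_checkable


class _Sentinel(enum.Enum):
    # Hypothetical stand-in for the PR's default-stream singleton;
    # an enum member is a singleton and needs no GPU library.
    DEFAULT_STREAM = enum.auto()


DEFAULT_STREAM = _Sentinel.DEFAULT_STREAM


@runtime_checkable
class HasPtr(Protocol):
    """Stream-like object exposing its raw address as `.ptr` (as cupy streams do)."""

    ptr: int


@runtime_checkable
class HasCudaStream(Protocol):
    """ASSUMPTION: object with `__cuda_stream__()` returning (version, address)."""

    def __cuda_stream__(self) -> tuple[int, int]: ...


def stream_address(stream: object) -> int | None:
    """Return the raw stream address, or None for "do not use the scheduled version"."""
    if stream is None:
        return None
    if stream is DEFAULT_STREAM:
        return 0  # address of the CUDA legacy default stream
    if isinstance(stream, HasCudaStream):
        return stream.__cuda_stream__()[1]
    if isinstance(stream, HasPtr):
        return stream.ptr
    raise TypeError(f"cannot extract a stream address from {type(stream).__name__}")


class FakeStream:
    """Illustration-only object that satisfies HasPtr without needing cupy."""

    def __init__(self, ptr: int) -> None:
        self.ptr = ptr
```

With this layout, callers can pass `None`, `DEFAULT_STREAM`, or any object matching one of the protocols, and `cupy` is only needed by code that actually creates `cupy` streams.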
|
There is a failing in See this test PR: #982 |
|
cscs-ci run distributed |
|
Mandatory Tests Please make sure you run these tests via comment before you merge!
Optional Tests To run benchmarks you can use:
To run tests and benchmarks with the DaCe backend you can use:
To run test levels ignored by the default test suite (mostly simple datatest for static fields computations) you can use:
For more detailed information please look at CI in the EXCLAIM universe. |
This PR introduces the scheduled exchange feature from GHEX into ICON4Py.
A scheduled exchange can be called before all previous work has completed, i.e. the exchange itself waits until the previous work is done. A similar feature is the "scheduled wait", which allows initiating the receive without having to wait for its completion.

In addition, this PR also renames the functions related to halo exchange:
- `exchange()` was renamed to `start()`.
- `wait()` was renamed to `finish()` (which might now return before the transfer has fully concluded).
- `exchange_and_wait()` was renamed to `exchange()`.

All of these functions now accept an argument called `stream`, which defaults to `DEFAULT_STREAM`. It indicates how synchronization with the stream should be performed.

In the case of `start()`, it means that the actual exchange should not start until all work previously submitted to `stream` has finished. For `finish()`, it means that further work submitted to `stream` should not start until the exchange has ended. For `finish()` it is also possible to specify `BLOCK`, which means that `finish()` waits until the transfer has fully finished.

The orchestrator was not updated, but the changes were made in such a way that it continues to work in diffusion, although using the original, blocking behaviour.
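The ordering semantics described above can be mimicked with a small CPU-only toy model. This is a sketch under loud assumptions, not the ICON4Py or GHEX implementation: `ToyStream` and `ToyHaloExchange` are hypothetical classes, a "stream" is just an ordered queue of named work items, and `BLOCK` is modelled as a plain sentinel object. It only shows that `start()` and `finish()` return immediately while preserving order on the stream, and that `finish(BLOCK)` returns after the transfer has completed.

```python
from __future__ import annotations

BLOCK = object()  # hypothetical sentinel standing in for the PR's `BLOCK`


class ToyStream:
    """Stand-in for a device stream: work is queued in order and runs later."""

    def __init__(self) -> None:
        self.pending: list[str] = []  # submitted but not yet executed
        self.log: list[str] = []      # executed, in execution order

    def submit(self, name: str) -> None:
        self.pending.append(name)     # returns immediately, nothing runs yet

    def synchronize(self) -> None:
        self.log.extend(self.pending)  # "run" everything in submission order
        self.pending.clear()


class ToyHaloExchange:
    """Toy model of start()/finish()/exchange() with stream ordering."""

    def __init__(self) -> None:
        self._started_on: ToyStream | None = None

    def start(self, stream: ToyStream) -> None:
        # Queued behind all work previously submitted to `stream`.
        stream.submit("halo: pack + send/recv")
        self._started_on = stream

    def finish(self, stream: ToyStream | object) -> None:
        if self._started_on is None:
            raise RuntimeError("finish() called before start()")
        if stream is BLOCK:
            # Blocking variant: return only after the transfer has completed.
            self._started_on.submit("halo: unpack")
            self._started_on.synchronize()
        else:
            # Scheduled variant: later work on `stream` is ordered after the
            # exchange, but this call itself does not wait.
            stream.submit("halo: unpack")

    def exchange(self, stream: ToyStream) -> None:
        self.start(stream)
        self.finish(stream)
```

In the scheduled case, a caller can queue `kernel_a`, call `start()` and `finish()`, queue `kernel_b`, and only then synchronize; the log then shows the exchange between the two kernels even though no call blocked.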
Note:
The CI fails for `cscs/extra`, but it also does so for current `main`; see this test PR: #982